encoding genes or by methods that activate a large battery of
chaperone molecules in the cell.

The second is the realization of a "true" and robust secretion
mechanism for the efficient release of protein into the culture
medium. There are several available systems that facilitate
secretion of recombinant proteins into the culture medium.
Some of these are based on the use of signal peptides, fusion
partners, and permeabilizing agents that cause disruption and
limited leakage of the outer membrane. Other efforts are
directed at pirating existing secretion pathways that promise
greater specificity of secretion. Work in this area will
necessitate an improved understanding of the various secretion
pathways in E. coli.

The third is the endowment of the prokaryotic cell with the
ability to perform some of the posttranslational modifications
found in eukaryotic proteins, such as glycosylation. This might
be done by the engineering of eukaryotic glycosylating enzymes
into the E. coli chromosome.
culture medium (reference 284 and references therein). Nutrient
composition and fermentation variables such as temperature, pH,
and other parameters can affect proteolytic activity,
secretion, and production levels (24, 25, 153, 324, 338, 614).
Specific manipulations of the culture medium have been shown
to enhance protein release into the medium. Thus,
supplementation of the growth medium with glycine enhances the
release of periplasmic proteins into the medium without causing
significant cell lysis (10, 13). Similarly, growth of cells under
osmotic stress in the presence of sorbitol and glycyl betaine
causes more than a 400-fold increase in the production of
soluble, active protein (49).

High-cell-density culture systems suffer from several
drawbacks, including limited availability of dissolved oxygen at
high cell density, carbon dioxide levels which can decrease growth
rates and stimulate acetate formation, reduction in the mixing
efficiency of the fermentor, and heat generation. The techniques
that are used to minimize such problems have been
examined in detail (338). A major challenge in the production
of recombinant protein at high cell density is the accumulation
of acetate, a lipophilic agent that is detrimental to cell growth
(265, 338, 353). A number of strategies have been developed to
reduce acetate formation in high-cell-density cultures, but
these suffer from several drawbacks (338). This problem was
recently resolved through the metabolic engineering of E. coli
(11, 12, 479). The alsS gene from B. subtilis, encoding the
enzyme acetolactate synthase, was introduced into E. coli cells.
This enzyme catalyzes the conversion of pyruvate to nonacidic
and less toxic byproducts. The reduction in acetate accumulation
caused a significant improvement in the production of
recombinant protein (12, 479). Mutant strains of E. coli that
are deficient in other enzymes have also been developed and
shown to produce less acetate and higher levels of human
recombinant proteins (30, 103, 273).

CONCLUSIONS AND FUTURE DIRECTIONS 

An efficient prokaryotic expression vector should contain a
strong and tightly regulated promoter, an SD site that is
positioned approximately 9 bp 5' to the translation initiation codon
and is complementary to the 3' end of 16S rRNA, and an
efficient transcription terminator positioned 3' to the gene
coding sequence. In addition, the vectors require an origin of
replication, a selection marker, and a gene that facilitates the
stringent regulation of promoter activity. This regulatory
element may be integrated either in the vector itself or in the host
chromosome. Other elements that may be beneficial include
transcriptional and translational "enhancers," as well as
"minicistrons" in translationally coupled systems. These may be
gene specific; therefore, their utility must be tested case by
case. The translational initiation region of a gene must be free
of secondary structures that may occlude the initiation codon
and/or block ribosome binding. UAAU is the most efficient
translation termination sequence in E. coli.
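
The sequence-level criteria above (SD spacing and the UAAU termination signal) can be checked mechanically. The following is a minimal sketch, not a validated design tool. It assumes the SD core motif GGAGG, measures spacing as the number of bases between the 3' end of that core and the first base of the initiation codon, and uses hypothetical function names; none of these choices is prescribed by the text.

```python
# Illustrative sketch only: the motif, the spacing convention, and all
# names below are assumptions made for this example.
SD_CORE = "GGAGG"  # assumed core of the Shine-Dalgarno consensus


def check_initiation_region(dna: str, start: int, window: int = 20) -> dict:
    """Locate the SD core upstream of the initiation codon at index `start`.

    Spacing is counted as the bases lying between the 3' end of the SD
    core and the first base of the initiation codon.
    """
    dna = dna.upper().replace("U", "T")
    upstream = dna[max(0, start - window):start]
    pos = upstream.rfind(SD_CORE)  # SD core closest to the start codon
    spacing = None
    if pos != -1:
        spacing = len(upstream) - (pos + len(SD_CORE))
    return {
        "start_codon": dna[start:start + 3],
        "sd_found": pos != -1,
        "spacing_bp": spacing,
    }


def ends_with_uaau(dna: str) -> bool:
    """True if the sequence (coding region plus one downstream base)
    ends with TAAT, i.e., UAAU in the mRNA."""
    return dna.upper().replace("U", "T").endswith("TAAT")


if __name__ == "__main__":
    example = "TT" + "AGGAGG" + "ACAGCTAAC" + "ATG" + "AAA" + "TAAT"
    print(check_initiation_region(example, start=17))
    print(ends_with_uaau(example))
```

A spacing that departs widely from roughly 9 bp, or a stop signal other than UAAU, would simply flag a construct for closer inspection; secondary structure around the initiation codon is not assessed here.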

There are many different prokaryotic vectors that allow the
tight regulation of gene expression. The experimental approaches
to achieve tight regulation of promoter activity range
from the simple repositioning of the operator in lac-based
systems to the construction of elaborate "cross-regulation"
systems. These vectors are efficient, and each system has its own
niche in prokaryotic gene expression. The demonstrated
effectiveness of a thermosensitive lac repressor now allows the
thermal regulation of lac-based promoters in lieu of using
IPTG.
To date there is no generally applicable strategy to prevent
the degradation of a wide variety of mRNA species in E. coli.
Although certain 5' and 3' stem-loop structures have been
shown to block mRNA degradation, these seem to stabilize
only specific mRNAs, under restricted conditions. One exception
appears to be the 5' UTR of the E. coli ompA transcript,
which prolongs the half-life of a number of heterologous
mRNAs in E. coli. The use of strains deficient in specific
RNases has been ineffective for enhanced gene expression.

Each of the four "compartments" for targeted protein
production, i.e., the cytoplasm, periplasm, inner and outer
membranes, and growth medium, offers advantages and
disadvantages for gene expression, depending on the experimental
objectives. The formation of inclusion bodies can be minimized
by a variety of techniques, but it remains a significant barrier to
high-level protein production in the cytoplasm. To date the
effectiveness of molecular chaperones has been protein
specific. It is possible that this is due to conditions that prevent the
formation of a thermodynamically stable end product, such as
the production of severely truncated proteins or single
domains from multisubunit protein complexes, lack of formation
of disulfide bonds, suboptimal growth conditions, absence of
posttranslational modifications, and the normally concerted
action of multiple types of chaperones in vivo. Nevertheless,
molecular chaperones have been used very successfully for the
enhanced production of specific proteins.

The wide variety of existing fusion partners have utility in
the production, detection, and purification of recombinant
proteins. Specific fusion moieties can increase the folding,
solubility, resistance to proteolysis, and secretion of recombinant
proteins into the growth medium.

Protein misfolding, attributed to the intracellular concentra- 
tion of aggregation-prone intermediates, may be minimized by 
a combination of experimental approaches: replacement of 
amino acid residues that cause aggregation, coexpression of 
molecular chaperones and foldases, reduction of the rate of 
protein synthesis, the use of solubilizing fusion partners, and 
the careful optimization of growth conditions.

Codon usage can have adverse effects on the synthesis and 
yield of recombinant proteins. However, the mere presence of 
rare codons in a gene does not necessarily dictate poor 
translation of that gene. Currently, we do not know all the rules 
that link codon usage and translation of a transcript. The lack 
of consistent results in the published literature on codon usage 
may be due to several variables, such as positional effects, the
clustering or interspersion of the rare codons, secondary struc- 
ture of the mRNA, and other effects. Positional effects appear 
to play an important role in protein synthesis. Thus, the pres- 
ence of rare codons near the 5' end of a transcript probably 
affects translational efficiency. This problem may be rectified 
by the alteration of the culprit codons, and/or the coexpression 
of the cognate tRNA genes. 

Much progress has been made in the elucidation of specific 
determinants of protein degradation in E. coli. Effective ap- 
proaches for the minimization of proteolysis in E. coli include 
the targeting of protein to the periplasm or the culture
medium, the use of protease-deficient host strains, the
construction of fusion proteins, the coexpression of molecular
chaperones, the coexpression of the T4 pin gene, the elimination of
protease cleavage sites through genetic engineering, and the
optimization of fermentation conditions. Host strains that are
deficient in the rpoH (htpR) locus are among the best,
particularly for thermally induced expression systems.

Future challenges in the use of E. coli for gene expression
will involve the following factors. The first is the achievement
of enhanced yields of correctly folded proteins by manipulating
the molecular chaperone machinery of the cell. Perhaps this
might be done by the coexpression of multiple chaperone-
was recently discovered (297), and a fascinating new mechanism
for the degradation of abnormal proteins in E. coli has
been described (299). The intense scientific interest in [...]
has generated new tools and strategies for
minimizing the degradation of heterologous proteins in E. coli.
Although the precise structural features that impart lability
to proteins are not known, some determinants of protein
instability have been elucidated. In a series of systematic studies,
Varshavsky and colleagues formulated the "N-end rule," which
relates the metabolic stability of a protein to its N-terminal
amino acid [...]. Thus, in E. coli, N-terminal Arg,
Lys, Leu, Phe, Tyr, and Trp conferred 2-min half-lives on a test
protein, whereas all the other amino acids except proline
conferred more than 10-h half-lives on the same protein [...]. As
discussed above (see the section on cytoplasmic expression),
amino acids with small side chains in the second position of the
polypeptide facilitate the methionine aminopeptidase-catalyzed
removal of the N-terminal methionine (250). Therefore,
these studies suggest that Leu in the second position would
probably be exposed by the removal of the methionine residue
and would destabilize the protein.
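
The N-end rule observations quoted above lend themselves to a simple lookup. The sketch below encodes only the half-life classes stated in this section for E. coli (about 2 min for Arg, Lys, Leu, Phe, Tyr, and Trp; more than 10 h for all others except proline, for which no value is given). The function name and the returned labels are illustrative assumptions.

```python
# Illustrative sketch: encodes only the half-life classes quoted in the
# text for a test protein in E. coli. Names and labels are hypothetical.
DESTABILIZING_N_TERMINI = {"R", "K", "L", "F", "Y", "W"}  # Arg, Lys, Leu, Phe, Tyr, Trp
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def n_end_half_life_class(residue: str) -> str:
    """Return the half-life class reported for a given N-terminal residue
    (one-letter code)."""
    r = residue.upper()
    if r not in STANDARD_RESIDUES:
        raise ValueError(f"not a standard amino acid code: {residue!r}")
    if r == "P":
        return "not reported"  # proline is excluded in the quoted data
    if r in DESTABILIZING_N_TERMINI:
        return "about 2 min"
    return "more than 10 h"


if __name__ == "__main__":
    for aa in "MLRGP":
        print(aa, n_end_half_life_class(aa))
```

Applied to the residue that becomes N-terminal after methionine removal, such a lookup gives a first indication of whether the mature protein carries a destabilizing N terminus.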

The second determinant of protein instability is a specific
internal lysine residue located near the amino terminus (14, 15,
86). This residue is the acceptor of a multiubiquitin chain that
facilitates protein degradation by a ubiquitin-dependent
protease in eukaryotes. Interestingly, in a multisubunit protein,
the two determinants can be located on different subunits and
still target the protein for processing (287).

Another correlation between amino acid content and protein
instability is presented in the PEST hypothesis (461). On
the basis of statistical analysis of eukaryotic proteins that have
short half-lives, it was proposed that proteins are destabilized
by regions enriched in Pro, Glu, Ser, and Thr, flanked by
certain amino acid residues. Phosphorylation of these PEST
domains leads to increased calcium binding, which in turn
facilitates the destruction of the protein by calcium-dependent
proteases. It was suggested that PEST-rich proteins may be [...]

Strategies for minimizing proteolysis of recombinant
proteins in E. coli have been reviewed in detail (25, 153, 395) and
are summarized in Table 2. These include protein targeting to
the periplasm (550) or the culture medium (230), the use of
protease-deficient host strains (211), growth of the host cells at
low temperature (100), construction of N- and/or C-terminal
fusion proteins (59, 230, 319, 393), tandem fusion of multiple
copies of the target gene (512), coexpression of molecular
chaperones (489, 581), coexpression of the T4 pin gene (519-
521), replacement of specific amino acid residues to eliminate
protease cleavage sites (243), modification of the
hydrophobicity of the target protein (394), and optimization of
fermentation conditions (24, 338).

The variety of approaches for protein stabilization attests
to the ingenuity of the investigators. However, the usefulness of
some of these methods may be limited, depending on the
intended use of the recombinant protein. Thus, for example,
the presence of fusion moieties on the target protein may
interfere with functional or structural properties (51) or
therapeutic applications of the product. The engineering of
enzymatic or chemical cleavage sites for the subsequent removal of
fusion partners is a complex process that involves numerous
considerations: the accessibility of the cleavage sites to
enzyme digestion; the purity, specificity, and cost of the
commercially available enzymes; the authenticity of the N or C
termini upon enzymatic digestion; the possible modification of
the target protein upon chemical treatment; and so forth (see,
e.g., references 78, 162, 419, and 567). For the large-scale
production of proteins, such problems are
amplified. Similarly, the fusion of multiple copies of the target
gene to create multidomain polypeptides (512) requires the
subsequent conversion to monomeric protein units by cyanogen
bromide cleavage. In this case, the target protein must not
contain internal methionine residues and must be able to
withstand harsh reaction conditions. Moreover, a limited extent of
amino acid side chain modification may occur, and the use
of cyanogen bromide presents a significant issue for large-scale
reactions. Similarly, the rational modification of a
protein sequence requires extensive structural information,
which may not be available. Molecular chaperones have been
used successfully to stabilize specific proteins (095), but this
approach remains a hit-or-miss affair (581).

The cytoplasm of E. coli contains a greater number of
proteases than the periplasm (545, 546). Therefore, proteins
located in the periplasm are less likely to be degraded. For
example, proinsulin localized to the periplasm was 10-fold
more stable than when produced in the cytoplasm (550).
However, proteolytic activity in the periplasm is substantial (367).
Secretion into the culture medium would provide a better
alternative in terms of protein stability. Unfortunately, the
technology for secretion of proteins from E. coli into the
culture medium is not yet well developed (528) (see the section on
extracellular secretion, above). A major catalyst of protein
degradation in bacteria is the induction of heat shock proteins
in response to a variety of stress conditions, such as the thermal
induction of gene expression or the accumulation of abnormal
or heterologous proteins in the cytoplasm (194). Under these
conditions, the production of the lon gene product, protease
La (195), and other proteases is enhanced. This problem is
minimized by the use of host strains deficient in the rpoH
locus [...]. The rpoH gene encodes the RNA
polymerase σ32 subunit, which regulates several proteolytic
activities in E. coli (20, 193). Hosts that carry the rpoH
mutation have been described (202) and have been demonstrated to
dramatically increase the production of foreign proteins in E.
coli (see, e.g., references 4, 9, 47, 70, and 373). Strain SG21173
(211), which is deficient in proteases La and Clp and the rpoH
locus, is particularly effective in protein production (9). A large
number of protease-deficient hosts exists (see, e.g., references
25 and 211), including some that are deficient in all known
protease loci that affect the stability of secreted proteins (372).
Before leaving this section, it is worth repeating a caveat on
the use of protease-deficient strains (581): proteolysis may be
an effect rather than a cause of folding problems, serving as a
means to remove misfolded and aggregated material
(238). Therefore, it is possible that the absence of proteases
will result in increased toxicity to the host as a result of the
accumulation of abnormal proteins.

FERMENTATION CONDITIONS 

Protein production in E. coli can be increased significantly
through the use of high-cell-density culture systems, which can
be classified into three groups: batch, fed-batch, and
continuous. These methods can achieve cell concentrations in excess
of 100 g (dry cell weight)/liter and can provide cost-effective
production of recombinant proteins. Detailed reviews of
fermentation systems have been published (338, 607,
614). The composition of the cell growth medium must be
carefully formulated and monitored, because it may have
significant metabolic effects on both the cells and protein
production. For example, the translation of different mRNAs is
differentially affected by temperature as well as changes in the
TABLE 4. Low-usage codons in E. coli^a

Codon(s)                 Amino acid

AGA, AGG, CGA, CGG       Arg
UGU, UGC                 Cys
GGA, GGG                 Gly
AUA                      Ile
CUA, CUC                 Leu
CCC, CCU, CCA            Pro
UCA, AGU, UCG, UCC       Ser
ACA                      Thr

^a The reported frequency of codon usage varies depending on the author
(based on references 293, 580, and 623).



the E. coli periplasm is facilitated by a group of proteins that
maintain the correct redox potential (26). It is thought that 
DsbA, a soluble periplasmic protein, directly catalyzes disul- 
fide bond formation in proteins whereas DsbB, an inner mem- 
brane protein, is involved in the reoxidation of DsbA (227a). 
Eukaryotic PDI was capable of complementing the phenotypes 
of dsbA null mutants (268, 433a), but its function was virtually 
abolished in dsbB mutants (433a). In addition, the ability of 
PDI to enhance the yield of target proteins was increased in 
the presence of exogenously added glutathione (268, 433a). 
These observations suggest that PDI depends on the presence 
of bacterial redox proteins for its reoxidation. The coexpres- 
sion of rat PDI has also been reported to enhance the correct 
folding of tissue plasminogen activator (433a). 

Protein misfolding can be attributed to the intracellular con- 
centration of aggregation-prone intermediates. Thus, although 
the subject of this review is the maximization of protein syn- 
thesis, reducing the rate of protein synthesis should disfavor 
protein misfolding. Indeed, the use of weaker promoters or 
conditions of partial induction from stronger promoters can 
result in larger amounts of soluble protein (180, 253). 
Kadokura et al. (291) showed that the ability of E. coli mutants
to secrete a large amount of alkaline phosphatase into the 
periplasm was due to a lower synthetic rate of the phoA gene 
product. 

CODON USAGE 

Genes in both prokaryotes and eukaryotes show a nonrandom
usage of synonymous codons (214, 228, 272, 509, 623).
The systematic analysis of codon usage patterns in E. coli led to
the following observations (124). (i) There is a bias for one or
two codons for almost all degenerate codon families. (ii) Certain
codons are most frequently used by all different genes
irrespective of the abundance of the protein; for example,
CCG is the preferred triplet encoding proline. (iii) Highly
expressed genes exhibit a greater degree of codon bias than do
poorly expressed ones. (iv) The frequency of use of synonymous
codons usually reflects the abundance of their cognate
tRNAs. These observations imply that heterologous genes
enriched with codons that are rarely used by E. coli (Table 4) may
not be expressed efficiently in E. coli.
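
As a practical illustration, the low-usage codons listed in Table 4 can be used to screen a coding sequence before expression is attempted. The following is a minimal sketch: the codon set is transcribed from Table 4, whereas the function names and the simple fraction reported are assumptions of this example. As the discussion in this section makes clear, a high fraction is a warning sign rather than a prediction of poor expression.

```python
# Illustrative sketch: codon set taken from Table 4; function names are
# hypothetical and the fraction is only a rough screening statistic.
LOW_USAGE_CODONS = {
    "AGA": "Arg", "AGG": "Arg", "CGA": "Arg", "CGG": "Arg",
    "UGU": "Cys", "UGC": "Cys",
    "GGA": "Gly", "GGG": "Gly",
    "AUA": "Ile",
    "CUA": "Leu", "CUC": "Leu",
    "CCC": "Pro", "CCU": "Pro", "CCA": "Pro",
    "UCA": "Ser", "AGU": "Ser", "UCG": "Ser", "UCC": "Ser",
    "ACA": "Thr",
}


def split_codons(seq: str) -> list:
    """Split an in-frame DNA or RNA sequence into RNA codons,
    ignoring any incomplete trailing codon."""
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def low_usage_report(seq: str):
    """Return (hits, fraction): hits is a list of (codon position, codon)
    with 1-based positions; fraction is hits / total codons."""
    codons = split_codons(seq)
    hits = [(i + 1, c) for i, c in enumerate(codons) if c in LOW_USAGE_CODONS]
    fraction = len(hits) / len(codons) if codons else 0.0
    return hits, fraction


if __name__ == "__main__":
    hits, fraction = low_usage_report("ATGAGGAAACCCTAA")
    print(hits, fraction)
```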

The minor arginine tRNA, tRNAArg(AGG/AGA), has been shown to
be a limiting factor in the bacterial expression of several
mammalian genes (62), because the codons AGA and AGG are
infrequently used in E. coli (91, 95, 214). The coexpression of
the argU (dnaY) gene that codes for tRNAArg(AGG/AGA) (175,
343) resulted in high-level production of the target protein
(62). The production of β-galactosidase decreased when AGG
codons were inserted before the 10th codon from the initiation
codon of the lacZ gene (92). Similarly, Goldman et al. (204)
reported that translational inhibition of a test mRNA was
much stronger in both arginine and leucine cases when the
consecutive low-usage codons were located near the 5' end of
the mRNA. Ivanov et al. (280) reported that tandem AGG
triplets caused a substantial inhibition of gene expression
independent of their localization in mRNA. These workers
attributed the inhibitory effect to a competition of the tandem
AGGAGG codons with the natural SD sequence. Other studies
showed that protein production levels could be increased
either by substitution of high-usage codons for low-usage ones
(see, e.g., references 3, 70, 135, 145, 248, 262, 383, 452) or by
coexpression of the "rare" tRNA gene (62, 126). The expression
of the ICP4 gene from herpes simplex virus was shown to
be inefficient because of the presence of an almost continuous
stretch of 19 serine residues (73). The efficiency of ICP4
synthesis was not improved by silent mutations in this serine-rich
region, supplementation of the growth medium with serine,
overexpression of seryl-tRNA synthetase, or expression of
tRNASer. The level of gene expression was inversely
proportional to the number of serine codons in this region (73).
Although this is certainly an extreme case, it is indicative of the
adverse effects of long stretches of similar codons on
translational efficiency.
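
The positional and clustering effects described above for the AGG and AGA codons can be flagged with a short scan. The sketch below is illustrative only: the cutoff of the first 10 codons is an assumption loosely based on the lacZ observation (92), the function name is hypothetical, and the output indicates features worth examining rather than a prediction of expression level.

```python
# Illustrative sketch: the early-codon cutoff and all names are
# assumptions made for this example.
RARE_ARG_CODONS = {"AGA", "AGG"}


def rare_arg_flags(seq: str, early_limit: int = 10) -> dict:
    """Flag AGA/AGG codons near the 5' end and tandem runs of them.

    Positions are 1-based codon indices, with the initiation codon as
    codon 1. `early_limit` is the number of leading codons inspected.
    """
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]

    early = [i + 1 for i, c in enumerate(codons[:early_limit])
             if c in RARE_ARG_CODONS]

    longest = run = 0
    for c in codons:
        run = run + 1 if c in RARE_ARG_CODONS else 0
        longest = max(longest, run)

    return {"early_positions": early, "longest_tandem_run": longest}


if __name__ == "__main__":
    gene = "ATGAGGAGGAAA" + "GCT" * 10 + "AGA"
    print(rare_arg_flags(gene))
```

A tandem run of two or more AGG codons is also worth noting because, as Ivanov et al. (280) proposed, the resulting AGGAGG sequence may compete with the natural SD sequence.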

In contrast, other workers reported very efficient expression
of genes that contained low-usage codons (see, e.g., references
154, 265, 334, 464, and 616). Similarly, in the case of the human
T-cell receptor Vβ5.3 gene that contains 4% AGA/AGG codons,
expansion of the intracellular pool of tRNAArg(AGG/AGA) did
not significantly increase the amount of Vβ5.3 detected in the
cells (9).

The evolutionary significance of codon usage patterns, as
well as mechanistic explanations for the effects of codon usage,
have been discussed by many workers (74, 92, 124, 155, 204, 245,
276, 293, 463, 474). To date, however, it has not been possible
to formulate general and unambiguous "rules" to predict
whether the content of low-usage codons in a specific gene
might adversely affect the efficiency of its expression in E. coli.
The experimental results may be confounded by several
variables, such as positional effects, the clustering or interspersion
of the rarely used codons, the secondary structure of the
mRNA, and other effects (204, 293). Nevertheless, from a
practical point of view, it is clear that the codon context of
specific genes can have adverse effects on both the quantity and
the quality of the protein produced. Usually, this problem can be
rectified by the alteration of the codons in question or by the
coexpression of the cognate tRNA genes.

PROTEIN DEGRADATION 

Proteolysis is a selective, highly regulated process that plays
an important role in cellular physiology (200, 203, 378). E. coli
contains a large number of proteases that are localized in the
cytoplasm, the periplasm, and the inner and outer membranes
(25, 199, 201, 212, 367). These proteolytic enzymes participate
in a host of metabolic activities, including the selective removal
of abnormal proteins (201, 212). Protein damage or alteration
may result from a variety of conditions, such as incomplete
polypeptides, mutations caused by amino acid substitutions,
excessive synthesis of subunits from multimeric complexes,
posttranslational damage through oxidation or free-radical
attack, and genetic engineering (201). Such abnormal proteins
are efficiently removed by the bacterial proteolytic machinery. To
date, the mechanisms of protein degradation are incompletely
understood, and it is unlikely that all proteolytic pathways or
enzymes operating in E. coli have been identified yet. For
example, a new protease associated with the outer membrane
Detection



Antibody 
Antibody 

Biochemical assay, antibody 



Antibody, fluorescent calmodulin ligand 
Antibody 

Biochemical assay, antibody 
Antibody 

Biochemical assay 



Labeled biotin 
Antibody 

Antibody




Antibody 
UV light 



TABLE 3 (Continued)
Applications 



Purification, detection 
Purification, detection 
Expression, purification, detection 
Expression, purification, detection 
Expression, purification, detection 
Purification, detection 

Expression, purification 

Expression, purification, detection 

Expression 

Purification 

Purification, detection 

Purification 

Expression, purification 
Detection, purification 
Purification, detection 
Purification, detection, assay systems 
Detection, purification 
Purification, detection 
Purification 
Expression 

Purification, refolding 

Purification 

Purification 

Purification, screening peptide libraries

Expression 
Expression 
Purification 
Purification 
Purification 

Purification, enzyme immobilization 
Secretion into culture medium 

Expression 

In vitro phosphorylation, purification 
Purification 

Purification, reverse epitope tagging^e
Detection, purification 

Detection 



References 



63, 259, 313, 446, 540 

251, 252, 578, 619 

101, 167, 225, 226, 462, 524 

411, 477, 568 

148, 230, 329, 418 

403 

330, 351, 591a 

181, 182, 192, 278, 472, 518, 569 
19, 76, 319, 595 
144, 166, 315, 459 
307, 308 

596 
109 

114, 608 
6 

484, 485 
410, 501 
392, 584, 585 
281 
146 

61, 487, 488, 533, 534

436 

436 

115, 177, 347, 354, 491 
215 

176, 271, 365, 380 
34, 136, 358 
555 
244 

431, 432 

303, 357, 528 

398-400 

267, 611 

88, 300 

318 

557 

582 

81, 113, 241



^c CKS, CTP:CMP-3-deoxy-D-manno-octulosonate cytidylyltransferase.
^d HA1, influenza virus hemagglutinin.

^e Reverse epitope tagging refers to tagging of the chromosomal rather than the plasmid-encoded protein, to avoid the need to remove the fusion partner.



use of chaperones for gene expression and provided detailed 
and rigorous assessments. This section is a distillation of the 
take-home lessons. 

Normally, protein folding proceeds toward a thermodynamically
stable end product (434, 476). Proteins that are drastically
destabilized will probably fold incorrectly, even in the
presence of chaperones. Thus, the truncation of polypeptides,
the production of single domains from multisubunit protein
complexes, the lack of formation of disulfide bonds which
ordinarily contribute to protein structure (320, 559), or the
absence of posttranslational modifications such as
glycosylation (116) may make it impossible to attain thermodynamic
stability. Moreover, it is now clear that different types of
chaperones normally act in concert (69, 327). Therefore, the
overproduction of a single chaperone may be ineffective. For
example, the overproduction of DnaK alone resulted in plasmid
instability which was alleviated by the coproduction of DnaJ
(52). Similarly, the coexpression of three chaperone genes in E.
coli increased the solubility of several kinases (79). In some
cases, it may be necessary to coexpress chaperones cloned from
the same source as the target protein (105). Still another
variable to consider is growth temperature. For example,
GroES-GroEL coexpression increased the production of
β-galactosidase at 30 but not 37 or 42°C, whereas DnaK and DnaJ were
effective at all temperatures tested (180). Finally, the
overexpression of chaperones can lead to phenotypic changes, such as
cell filamentation, that can be detrimental to cell viability and
protein production (52).

Two recent reports have shown that the coexpression of the 
human (268) or rat (433a) protein disulfide isomerase (PDI) 
with the target gene enhances the yield of correctly folded 
protein in the E. coli periplasm. Disulfide bond formation in
TABLE 3. Fusion partners and their applications^a



Fusion partner 



Ligand/matrix



Purification conditions 



Flag peptide 
His6

Glutathione S-transferase
Staphylococcal protein A 
Streptococcal protein G 
Calmodulin 

Thioredoxin 

β-Galactosidase

Ubiquitin 

Chloramphenicol acetyltransferase 
S-peptide (RNase A, residues 1-20) 

Myosin heavy chain 
DsbA 

Biotin subunit (in vivo biotinylation) 

Avidin 

Streptavidin 

Strep-tag 

c-myc 

Dihydrofolate reductase 
CKS^c

Polyarginine 
Polycysteine 
Polyphenylalanine 
lac repressor 

T4 gp55 

Growth hormone N terminus 
Maltose-binding protein 
Galactose-binding protein 
Cyclomaltodextrin glucanotransferase 
Cellulose-binding domain 
Hemolysin A, E. coli 
λ cII protein
TrpE or TrpLE 
Protein kinase site(s) 
(AlaTrpTrpPro)n
HA1^d epitope

BTag (VP7 protein region of 

bluetongue virus) 
Green fluorescent protein 



Anti-Flag monoclonal antibodies M1, M2
Ni 2+ -nitrilotriacetic acid 
Glutathione-Sepharose 
Immunoglobulin G-Sepharose 
Albumin 

Organic ligands, peptide ligands, DEAE- 

Sephadex 
ThioBond resin 
TPEG^b-Sepharose

Chloramphenicol-Sepharose 
S-protein (RNase A, residues 21-124) 



Biotin 
Biotin 

Streptavidin 
Anti-myc antibody 
Methotrexate-agarose 



S-Sepharose 
Thiopropyl-Sepharose 
Phenyl-Superose 
lac operator 



Amylose resin
Galactose-Sepharose 
α-Cyclodextrin-agarose
Cellulose 



Low calcium, EDTA, glycine 
Imidazole 

Reduced glutathione 
Low pH, IgG-affinity ligand 
Low pH, albumin-affinity ligand 
Low calcium 

Ion exchange 
Borate 

Chloramphenicol 
Denaturing or nondenaturing 

conditions 
Differential solubility in low/high salt 



Denaturation (urea, heat) 
Denaturation (urea, heat) 
2-Iminobiotin, diaminobiotin 

Folate buffer 

NaCl 

Dithiothreitol 
Ethylene glycol 

Lactose analog, DNase, restriction 
endonuclease 



Maltose 
Galactose 
α-Cyclodextrin
Water 



Aqueous two-phase extraction 



Anti-BTag antibodies 



The fusion of genes to the ubiquitin sequence increased the 
yield of proteins from undetectable to 20% of the total cellular 
protein (76, 595). Similar results have been obtained by many 
other workers (reference 319 and references therein). The 
remarkable increase in protein yield was thought to be due to 
protection of the target protein from proteolysis, improved 
folding, and efficient mRNA translation (76). Ubiquitin or the 
ubiquitin metabolic pathway is absent in prokaryotic organ- 
isms. To remove the ubiquitin moiety from fusion proteins, 
Baker et al. (19) coexpressed the ubiquitin-specific protease 
Ubp2 in E. coli, thus effecting the cotranslational cleavage of
ubiquitin from the fusion protein. 

MOLECULAR CHAPERONES 

It is now well established that the efficient posttranslational 
folding of proteins, the assembly of polypeptides into oligo- 
meric structures, and the localization of proteins are mediated 
by specialized proteins termed molecular chaperones (33. 69. 
104, 149, 183. 189, 246, 350. 364, 601). The demonstration that 
efficient production and assembly of prokaryotic ribulose 
bisphosphate carboxylase in £. coli require both GroES and 



GroEL proteins (208) led to an increasing interest in the use of 
molecular chaperones for high -level gene expression in £ coli 
(106). The experimental results from the use of chaperones, 
however, have been inconsistent, and thus far the effects of 
chaperone coproduction on gene expression in £. coli appear 
to be protein specific (581). For example, although the 
GroESL plasmids have been disseminated to more that 400 
workers, only half of those who used them reported an im- 
provement in gene expression (350a). This is consistent with 
recent observations that whereas the coproduction of thiore- 
doxin in £. coli caused a dramatic increase in the solubility of 
eight vertebrate proteins, the coproduction of the GroESL 
chaperones increased the solubilities of only four of those 
proteins (613). It is also unclear whether the in vivo levels of 
different chaperone species are limiting under conditions of 
gene overexpression. For example, Knappik et al. (312) exam- 
ined the effect of folding catalysts on the production of anti- 
body fragments in the periplasm. Whereas the presence of the 
disulfide-forming protein DsbA was absolutely required in 
vivo, its overexpression did not increase the yield of antibody 
fragments. Wall and Plückthun (581) and Georgiou and Valax
(180) revisited the assumptions and expectations behind the 
TABLE 2—Continued

Compartment property

Extracellular medium
  Advantages
    Least level of proteolysis
    Purification is simpler (fewest protein types)
    Improved protein folding
    N-terminus authenticity
  Disadvantages
    No secretion usually
    Protein diluted, more difficult to purify

Strategy for resolution

  Fusions to normally secreted proteins
  Coexpression of kil for permeabilization
  Fusion to ompF gene components
  Use of ompA signal sequence
  Use of protein A signal sequence
  Coexpression of bacteriocin release protein
  Use of glycine and bacteriocin release protein
  Glycine supplement in medium
  Fusion partners
  Expanded-bed adsorption
  Concentration, affinity chromatography

Reference(s)

  309
  303, 528
  296, 309
  396
  316
  256
  261
  617
  10, 13
  384
  231

Cell surface
  To date not useful for high-level gene expression. May
  facilitate vaccine development, drug screening,
  biocatalysis, protein-protein interactions, and other
  applications (85, 111, 169, 178, 179, 253, 345, 352, 405)




profound consequences. For example, the retention of the 
initiating methionine in RANTES, a member of the chemo- 
kine family of cytokines, completely abrogates the physiologi- 
cal activity of this molecule and confers potent antagonist 
properties to the methionylated RANTES (448). Similarly, an 
unnatural N-terminal methionine residue can alter the confor- 
mation of the human hemoglobin molecule (298). Moreover, it 
is possible that the presence of an extra amino acid will change 
the immunological properties of pharmaceutical proteins and 
create difficulties in the approval of a nonnative product for 
clinical use. 

Bacterial translation is initiated by N-formylmethionine,
which is deformylated during synthesis (2) but not necessarily 
removed. The N-terminal methionine might be cleaved off by 
an endogenous methionine aminopeptidase (39) depending on 
the side chain length of the second amino acid residue (250). 
Thus, residues with small side chains such as Gly, Ala, Pro, Ser, 
Thr, Val, Cys, and, to a lesser degree, Asn, Asp, Leu, and Ile,
facilitate the methionine aminopeptidase-catalyzed removal of 
the N-terminal methionine (250). One strategy that has been 
successfully used to remove the extra methionine residue from 
recombinant proteins in vivo is coexpression of the E. coli 
methionine aminopeptidase gene (483, 513). An alternative 
method for the in vitro generation of an authentic N terminus 
uses the exopeptidase dipeptidylaminopeptidase I. This en- 
zyme removes dipeptides from the N terminus but cannot 
cleave peptide bonds containing a proline residue. Dalbøge et
al. (117) produced human growth hormone containing an ami- 
no-terminal extension which was subsequently removed with 
dipeptidylaminopeptidase I to yield authentic growth hor- 
mone. This approach requires an amino-terminal extension 
that contains an even number of amino acid residues and is 
designed so that it enables the in vivo excision of the N- 
terminal methionine. In addition, the second or third amino 
acid residue in the target protein must be proline (117). A 
more elaborate method free of the above restrictions has been
proposed to generate an authentic N terminus for any protein
(117). The cotranslational amino-terminal processing in both 
prokaryotes and eukaryotes has been reviewed (301). 
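
The rules described in this paragraph can be expressed as a short sketch. The function names, the three-way classification labels, and the reading of "extension" as the residues left after excision of the initiating methionine are assumptions made for illustration; only the residue lists and the two dipeptidylaminopeptidase I constraints come from the text above (117, 250).

```python
# Second-position residues that favor removal of the initiating methionine,
# as listed in the text (250). The grouping labels are illustrative only.
EFFICIENT = set("GAPSTVC")  # Gly, Ala, Pro, Ser, Thr, Val, Cys
PARTIAL = set("NDLI")       # Asn, Asp, Leu, Ile ("to a lesser degree")


def met_removal_class(protein: str) -> str:
    """Classify the expected fate of the N-terminal Met from the second residue."""
    seq = protein.upper()
    if len(seq) < 2 or seq[0] != "M":
        return "not applicable"
    second = seq[1]
    if second in EFFICIENT:
        return "likely removed"
    if second in PARTIAL:
        return "partially removed"
    return "likely retained"


def dpp1_extension_ok(extension: str, target: str) -> bool:
    """Check the two constraints stated for the dipeptidylaminopeptidase I route.

    extension: residues in front of the target after Met excision (assumption).
    target: sequence of the authentic protein.
    """
    # The enzyme removes dipeptides, so the extension must be of even length.
    even_length = len(extension) > 0 and len(extension) % 2 == 0
    # Proline at position 2 or 3 of the target stops further dipeptide removal.
    proline_stop = len(target) >= 3 and "P" in target[1:3].upper()
    return even_length and proline_stop
```

A call such as met_removal_class("MKKT") reports a sequence whose methionine is likely to be retained, which is the case in which coexpression of methionine aminopeptidase or the in vitro route would be considered.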

Protein degradation is more likely to occur in the cytoplasm 
of E. coli than in other compartments (550) because of the
greater number of proteases located there (545, 546). This 
topic is examined in the section on protein degradation (below).
Finally, another difficulty that affects cytosolic gene expression
is the need to purify the target protein from the pool
of the intracellular proteins. Calculations based on total DNA 
content predict that the E. coli chromosome may encode 3,000 
to 4,000 genes (547), although not all of these are expressed 
under given growth conditions. 

Periplasmic Expression 

The periplasm offers several advantages for protein target- 
ing. In contrast to the cytosolic compartment, the periplasm 
contains only 4% of the total cell protein (416) or approxi- 
mately 100 proteins (450). The target protein is thus effectively 
concentrated, and its purification is considerably less onerous. 
The oxidizing environment of the periplasm facilitates the 
proper folding of proteins, and the cleaving in vivo of the signal 
peptide during translocation to the periplasm is more likely to 
yield the authentic N terminus of the target protein. Protein 
degradation in the periplasm is also less extensive (550). 

The transport of a protein through the inner membrane to 
the periplasm normally requires a signal sequence (376, 490, 
492, 575-577, 589). A wide variety of signal peptides have been
used successfully in E. coli for protein translocation to the
periplasm. These include prokaryotic signal sequences, such as
the E. coli PhoA signal (127, 424), OmpA (127, 185, 205, 263,
339), OmpT (286), LamB and OmpF (255), β-lactamase (292,
574), enterotoxins ST-II, LT-A, LT-B (171, 388), protein A
from Staphylococcus aureus (1, 256), endoglucanase from B.
subtilis (348), PelB from Erwinia carotovora (44, 340), a degen-
TABLE 2. Relative merits of different compartments for gene expression in E. coli and strategies for the potential
resolution of experimental problems

Compartment property
    Strategy for resolution (reference[s])

Cytoplasm
  Advantages
    Inclusion bodies: facile isolation of protein in high
    purity and concentration; target protein protected
    from proteases; desirable for production of proteins
    that, if active, are lethal to host cell
    (25, 99, 243, 294, 310, 469)
    Higher protein yields
    Simpler plasmid constructs
  Disadvantages
    Inclusion bodies: protein insolubility; refolding to
    regain protein activity; refolded protein may not
    regain its biological activity; reduction in final
    protein yield; increase in cost of goods
        Lower growth temperature (495)
        Cold shock promoters (lower temperature) (187, 206, 433)
        Selection of different E. coli strains (302)
        Amino acid substitutions (118, 282, 394, 457, 536)
        Coexpression of molecular chaperones (119, 581, 613)
        Fusion partners (330, 419, 591a)
        Strains deficient in thioredoxin reductase (128, 447)
        Sorbitol and glycyl betaine in culture medium (49)
        Altered pH (541)
        Sucrose, raffinose in growth medium (56)
        Rich growth media (386)
    Reducing environment: does not facilitate disulfide
    bond formation
        Strains deficient in thioredoxin reductase (128, 447)
    Authenticity; N-terminal methionine
        Coexpression of methionine aminopeptidase (483, 513)
    Proteolysis
        Protease-deficient strains (211)
        Mutagenesis of protease cleavage sites (25, 243, 395)
        Hydrophobicity engineering (394)
        Fusion partners (59, 319, 393, 395, 567)
        Fermentation conditions (24, 25, 100, 337)
        Coexpression of phage T4 pin gene (519-521)
        Coexpression of molecular chaperones (180, 489, 581)
        Fusion of multiple copies of target gene (512)
    Purification is more complex (more protein types)
        Affinity fusion partners (may require cleavage) (419)

Periplasm
  Advantages
    Purification is simpler (fewer protein types) (416)
    Proteolysis is less extensive
        Protease-deficient strains (372, 373)
        Fusion partners (230)
        Other approaches as above
    Improved disulfide bond formation/folding
    N-terminus authenticity
  Disadvantages
    Signal peptide does not always facilitate transport;
    protein export machinery overloaded?
        Coexpression of signal peptidase I (570)
        Co-overexpression of prlF (379)
        Use of prlF mutant strains (525)
        Coexpression of prlA4 and secE (435)
        Expression of pspA (311)
        Coexpression of sec genes (581)
        Fusion proteins (220)
        Amino acid substitutions (314)
    Reduced folding
        Coexpression of protein disulfide isomerase (268, 433a)
        Coexpression of molecular chaperones (45)
    Inclusion bodies may form
        Lower growth temperature (57, 58, 82)
        Sucrose, raffinose in growth medium (56)

Inner membrane
    To date not useful for high-level gene expression; may
    facilitate pharmacological studies, enzymatic activity
    studies, and other applications (220, 500)

universal mRNA stabilizer. For example, the replacement of
the 3'-terminal hairpins of labile mRNAs with those from
stable mRNAs did not enhance the expression of the labile
transcripts (38, 89, 597). Furthermore, it has been suggested
that gene expression might be enhanced by the use of host 
strains that are deficient in specific RNases, such as RNase II 
or PNPase. This, too, is unlikely to be an effective strategy, 
because the absence of RNase II or PNPase, as well as the
overproduction of RNase II, had no effect on the average 
half-life of E. coli bulk mRNA (138, 139). Moreover, strains 
that were deficient in both RNase II and PNPase were inviable 
(138). These and other considerations led to the following
conclusions: "It is unlikely that the disparate stabilities of most 
mRNAs that end in a stem-loop result from differential sus- 
ceptibility of these terminal stem-loops to penetration by 3' 
exonucleases," and, furthermore, "3'-exonucleolytic initiation 
of RNA decay probably is rare, except in the case of labile 
RNAs lacking a substantial 3' hairpin and long-lived RNAs 
resistant to attack by all other types of ribonucleases" (35).

Translational Termination 

The presence of a stop signal in the mRNA is an indispensable
component of the translation termination process. In addition
to the three termination codons, UAA, UGA, and
UAG, this complex event involves specific interactions between
the ribosome, mRNA, and several release factors at the
site of termination (112, 553). In E. coli, RF-1 terminates
translation at the UAG stop codon, RF-2 terminates transla- 
tion at the UGA codon, and both RFs terminate translation at 
the UAA codon (507). An additional factor, RF-3, has recently 
been cloned (219, 377). 
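
The division of labor between RF-1 and RF-2 stated above can be written as a simple lookup. This is a minimal sketch; the dictionary and function names are hypothetical, and RF-3 is omitted because the text does not assign it a codon specificity.

```python
# Codon specificity of the E. coli release factors as stated in the text (507).
RELEASE_FACTORS = {
    "UAA": ("RF-1", "RF-2"),
    "UAG": ("RF-1",),
    "UGA": ("RF-2",),
}


def release_factors_for(codon: str) -> tuple:
    """Return the release factors acting at a codon (empty tuple for sense codons)."""
    rna = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    return RELEASE_FACTORS.get(rna, ())
```

Because UAA is the only codon read by both factors, a lookup of this kind makes it easy to see which single release factor a given construct depends on.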

The design of expression vectors frequently includes the 
insertion of all three stop codons to prevent possible ribosome 
skipping. In E. coli, there is a preference for the UAA stop
codon (508). A statistical analysis of more than 2,000 E. coli 
genes revealed local nonrandomness both in the stop codon 
and in the nucleotide immediately following the triplet (445,
553). The same workers tested the strengths of each of 12
possible tetranucleotide "stop signals" (UAAN, UGAN,
UAGN) by an in vivo termination assay that measured termi- 
nation efficiency by its direct competition with frameshifting. 
Termination efficiencies varied significantly depending on both 
the stop codon and the fourth nucleotide, ranging from 80%
(UAAU) to 7% (UGAC). These findings indicate that the 
identity of the nucleotide immediately following the stop 
codon strongly influences the efficiency of translational termi- 
nation in E. coli (445). Therefore, UAAU is the most efficient 
translational termination sequence in E. coli. 
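
When an expression construct is designed, the tetranucleotide stop signal can be read directly from the coding sequence. The following sketch assumes an open reading frame that begins at index 0 unless a start offset is given; the function name is hypothetical, and only the two efficiencies reported above (445) are included, since the remaining ten values are not given here.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Only the two extremes quoted in the text; other contexts are not listed there.
REPORTED_EFFICIENCY = {"UAAU": 0.80, "UGAC": 0.07}


def stop_signal(mrna: str, start: int = 0):
    """Return the first in-frame stop codon plus the following base, or None.

    If the transcript ends at the stop codon, only the three bases are returned.
    """
    rna = mrna.upper().replace("T", "U")
    for i in range(start, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return rna[i:i + 4]
    return None
```

For example, stop_signal("ATGAAATAATCC") yields the UAAU context, which can then be looked up in REPORTED_EFFICIENCY; contexts absent from that table simply return no value rather than a guessed one.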

The sequence context at the 5' end of the stop codon further 
influences the efficiency of termination. Thus, the charge and 
hydrophobicity properties of the penultimate (-2 location) 
C-terminal amino acid residue in the nascent peptide cause up 
to a 10-fold difference in UGA termination efficiency, whereas 
termination at UAG is less sensitive to the nature of the -2 
amino acid residue (389). For the -1 location, α-helical,
β-strand, and reverse-turn propensities are determining factors
in UGA termination (48). 

PROTEIN TARGETING 

Cytoplasmic Expression 

The formation of inclusion bodies remains a significant bar- 
rier to gene expression in the cytosol. Inclusion bodies do offer 
several advantages (Table 2). However, these are small
consolation considering the arduous task of refolding the aggregated
protein (469), the uncertainty of whether the refolded protein 
retained its biological activity, and the reduction in yield of the 
refolded and purified protein. To date, the precise physicochemical
parameters that contribute to the formation of inclusion
bodies remain unclear (322, 363, 381, 469, 495, 588). A
statistical analysis of the composition of 81 proteins that do
and do not form inclusion bodies in E. coli concluded that six
parameters are correlated with inclusion body formation: 
charge average, turn-forming residue fraction, cysteine frac- 
tion, proline fraction, hydrophilicity, and total number of res- 
idues (591). The first two parameters are strongly correlated 
with inclusion body formation, while the last four parameters 
show a weak correlation. These findings were used to develop 
a model to predict the probability of inclusion body formation 
solely on the basis of the amino acid composition of a protein
(591). This model was used to predict accurately the insolubility
of the human T-cell receptor Vβ5.3 in E. coli (9).
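
Several of the composition parameters named above can be computed directly from a protein sequence. The sketch below is not the published model (591): the formulas for charge average and for the turn-forming fraction (Asn, Gly, Pro, and Ser) are assumptions, hydrophilicity is left out because it requires a residue scale that is not specified here, and no prediction coefficients are supplied.

```python
def composition_parameters(protein: str) -> dict:
    """Compute composition-based parameters from a one-letter protein sequence."""
    seq = protein.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    count = seq.count
    return {
        # Assumed definition: (Lys + Arg - Asp - Glu) per residue.
        "charge_average": (count("K") + count("R") - count("D") - count("E")) / n,
        # Assumed turn-forming residues: Asn, Gly, Pro, Ser.
        "turn_forming_fraction": (count("N") + count("G") + count("P") + count("S")) / n,
        "cysteine_fraction": count("C") / n,
        "proline_fraction": count("P") / n,
        "total_residues": n,
    }
```

Tabulating these values for a set of candidate constructs gives a quick way to compare them on the two parameters reported to correlate most strongly with inclusion body formation.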

Several experimental approaches have been used to mini- 
mize the formation of inclusion bodies and improve protein 
folding (496) (Table 2). These include the growth of bacterial 
cultures at lower temperatures (77, 495, 497, 517); the selection
of different E. coli strains (302); the substitution of selected
amino acid residues (118, 457); the coproduction of
chaperones (8, 29, 52, 337, 613); the use of E. coli thioredoxin 
either as a fusion partner (330) or coproduced with the protein 
of interest (613); growth and induction of the cells under osmotic
stress in the presence of sorbitol and glycyl betaine (49);
addition of nonmetabolizable sugars to the growth medium 
(56); alteration of the pH of the culture medium (541); and the 
use of strains deficient in thioredoxin reductase (128, 447). 

The reducing potential of the cytoplasmic redox state (156, 
270) presents still another problem. Bacterial cytoplasmic pro- 
teins contain few cysteine residues and few disulfide bonds 
(156, 444). Most proteins that contain stable disulfide bonds
are exported from the cytoplasm (559). Thus, mammalian pro- 
teins whose complex tertiary structure depends in part on di- 
sulfide bond formation may not be produced in their correct 
conformation in the bacterial cytoplasm (443). Bardwell et al. 
have proposed that the low frequency of disulfide bonds in 
cytoplasmic proteins may be due to the absence from the
cytoplasm of a system for the formation of disulfide bonds, 
such as the DsbA and DsbB proteins (26, 27), and/or a mech- 
anism that actively prevents the formation of disulfide bonds in 
the cytoplasm. Mutant E. coli strains that allow the formation 
of disulfide bonds in the cytoplasm were isolated (128). These 
mutations inactivate the trxB gene that encodes thioredoxin 
reductase (168) and contributes to the sulfhydryl reducing po- 
tential of the cytoplasm (258). Thioredoxin itself was unneces- 
sary for disulfide bond formation (128). The precise sequence 
of events is not clearly understood, and the authors suggested 
that the cytoplasm may contain another thioredoxin-like protein
that can be reduced by thioredoxin reductase; in the absence
of thioredoxin reductase, the oxidized form of this unknown
protein facilitates the formation of disulfide bonds in
the cytoplasm (128). Other workers have recently used E. coli 
strains carrying null mutations in the trxB gene and observed
significant amounts of functional disulfide-containing protein
in the cytoplasm (447). These thioredoxin reductase-deficient
strains should prove to be valuable tools for the production of
complex proteins in E. coli.

The cytoplasmic expression of a gene without a leader requires
the presence of an initiation codon, the most common
one encoding methionine. Although this extraneous amino
acid might have no adverse effect on the protein synthesized,
there are specific cases in which the extra methionine has 
gene expression is based on the proposition that the rate of
protein folding will be only slightly affected at about 15 to 20°C,
whereas the rates of transcription and translation, being biochemical
reactions, will be substantially decreased. This, in
turn, will provide sufficient time for protein refolding, yielding
active proteins and avoiding the formation of inactive protein 
aggregates, i.e., inclusion bodies, without reducing the final 
yield of the target protein (433). It would be interesting to 
compare the transcriptional activities of other promoters de- 
rived from cold shock genes (288, 402). 

Other promoters that have been characterized recently (Table 1)
possess attractive features and should provide additional
options for high-level gene expression systems. For example,
the pH promoter (102, 561) is very strong: recombinant proteins
are produced at levels of up to 40 to 50% of the total
cellular protein (480). This expression level, however, will 
probably vary for different genes, because protein synthesis 
depends on translational efficiency as well as promoter 
strength. 

E. coli promoters are usually considered in terms of a core 
region composed of the -10 and -35 hexameric sequences 
including a 15- to 19-bp spacer between the two hexamers 
(344). However, it has been proposed that elements outside 
the core region stimulate promoter activity (134). Many studies
have demonstrated that sequences upstream of the core promoter
increase the rate of transcription initiation in vivo (172,
213, 264, 290, 618). Gourse and colleagues have shown that a
DNA sequence, the UP element, located upstream of the -35
region of the E. coli rRNA promoter rrnB P1, stimulates transcription
by a factor of 30 in vitro and in vivo (290, 453, 468).
The UP element functions as an independent promoter module
because when it is fused to other promoters, such as lacUV5,
it stimulates transcription (453, 468). Upstream activation in E.
coli and other organisms has been reviewed in detail (110). The
ability of the UP element to act as a transcriptional enhancer 
when fused to heterologous promoters may be of general util- 
ity in high-level expression systems. 
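
The core promoter geometry described above (two hexamers separated by a 15- to 19-bp spacer) lends itself to a simple sequence scan. In this sketch the hexamer sequences TTGACA and TATAAT are the commonly cited consensus sequences and are an assumption, as the text does not spell them out; the function name is hypothetical. Natural promoters rarely match both hexamers exactly, so an exact-match scan will miss most real promoters and serves only to illustrate the spacing rule.

```python
MINUS_35 = "TTGACA"  # assumed consensus -35 hexamer (not given in the text)
MINUS_10 = "TATAAT"  # assumed consensus -10 hexamer (not given in the text)


def find_core_promoters(dna: str, min_spacer: int = 15, max_spacer: int = 19):
    """Return (position of -35 hexamer, spacer length) for exact consensus matches."""
    seq = dna.upper()
    hits = []
    start = seq.find(MINUS_35)
    while start != -1:
        after_35 = start + len(MINUS_35)
        # Try every allowed spacer length between the two hexamers.
        for spacer in range(min_spacer, max_spacer + 1):
            pos_10 = after_35 + spacer
            if seq[pos_10:pos_10 + len(MINUS_10)] == MINUS_10:
                hits.append((start, spacer))
        start = seq.find(MINUS_35, start + 1)
    return hits
```

A scan of this kind could be extended to look for an A+T-rich region upstream of each hit as a crude proxy for a UP element, but the sequence requirements of that element are not given here and are therefore not encoded.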

Although the extraordinary strength of the rRNA promoters
P1 and P2 is well documented (173, 414), these promoters have
not been exploited for the high-level production of proteins in
E. coli, mainly because their regulation is more difficult. The in
vivo synthesis of rRNA is subject to growth rate control (213),
and P1 and P2 are active during periods of rapid cell growth
and are downregulated when cells are in the stationary phase
of growth. Therefore, the rRNA promoters would be continuously
active or "leaky" during the preinduction phase. In vivo,
P2 is the weaker, less inducible promoter in rapidly growing
cells. However, when uncoupled from P1, the P2 promoter
shows increased activity (up to 70% of that of P1) and becomes
sensitive to the stringent response, indicating that in its native
tandem context, P2 is partially occluded (173, 289). Brosius
and Holy (66) inserted the lac operator sequence downstream
of the rrnB rRNA P2 promoter and achieved repression of P2
in strains harboring the lacIq gene. Transcriptional activity was
measured by the production of chloramphenicol acetyltransferase
and by the expression of the 4.5S RNA. However, the P2
construction was only half as active as the lac promoter, and
furthermore, when the rrnB P1 promoter was placed upstream
of the P2 promoter, transcriptional repression was incomplete
(66).

It is tempting to speculate that rRNA promoters could be 
tightly regulated by using the concept of inverted promoters
(see the section on tightly regulated expression systems below).
Thus, an rRNA promoter could be cloned upstream of the
gene of interest but in the opposite transcriptional direction.
The use of λ integration sites and a regulated λ integrase
would facilitate the inversion of the promoter for induction,
and the presence of strong transcription terminators upstream 
of the highly active promoter would prevent destabilization of 
the vector during the preinduction phase. 

Transcriptional Terminators 

In prokaryotes, transcription termination is effected by two 
different types of mechanisms: Rho-dependent transcription 
termination depends on the hexameric protein rho, which 
causes the release of the nascent RNA transcript from the 
template. In contrast, rho-independent termination depends
on signals encoded in the template, specifically, a region of 
dyad symmetry that encodes a hairpin or stem-loop structure in 
the nascent RNA and a second region that is rich in dA and dT
and is located 4 to 9 bp distal to the dyadic sequence (83, 122,
439, 455, 456, 465, 594, 609). Although often overlooked in the 
construction of expression plasmids, efficient transcription ter- 
minators are indispensable elements of expression vectors be- 
cause they serve several important functions. Transcription 
through a promoter may inhibit its function, a phenomenon
known as promoter occlusion (5). This interference can be 
prevented by the proper placement of a transcription termina- 
tor downstream of the coding sequence to prevent continued 
transcription through another promoter. Similarly, a transcrip- 
tion terminator placed upstream of the promoter that drives 
expression of the gene of interest minimizes background transcription
(413). It is also known that transcription from strong
promoters can destabilize plasmids: transcriptional readthrough
into the replication region causes overproduction of the ROP
protein, which is involved in the control of plasmid copy
number (539). In addition, transcription terminators
enhance mRNA stability (237, 404, 597) and can substantially 
increase the level of protein production (237, 572). Particularly 
effective are the two tandem transcription terminators T1 and
T2, derived from the rrnB rRNA operon of E. coli (67), but
many other sequences are also quite effective. 
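
The two sequence features of a rho-independent terminator described above, a region of dyad symmetry followed by a dA/dT-rich region, can be searched for with a short scan of the coding strand. This is an illustrative sketch only: the function names and all thresholds (stem length, loop sizes, length of the downstream window, minimum number of T residues) are arbitrary assumptions, and perfect base pairing in the stem is required, which real terminators do not always show.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def find_terminator_candidates(dna, stem=6, loop_range=(3, 8), tail=8, min_t=5):
    """Return (start, loop length) of perfect hairpins followed by a T-rich window.

    dna is read as the coding strand, so the dA/dT-rich region appears as T's.
    """
    seq = dna.upper()
    hits = []
    last_start = len(seq) - 2 * stem - loop_range[0] - tail
    for i in range(last_start + 1):
        left = seq[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                      # start of the right arm
            right = seq[j:j + stem]
            downstream = seq[j + stem:j + stem + tail]
            if len(downstream) < tail:
                break                                # ran off the end of the sequence
            if right == reverse_complement(left) and downstream.count("T") >= min_t:
                hits.append((i, loop))
    return hits
```

Running such a scan on the region downstream of a cloned gene gives a quick indication of whether a terminator-like element is present before one is added deliberately; it is not a substitute for a thermodynamic folding prediction.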



Transcriptional Antiterminators 

In bacteria, many operons involved in amino acid biosynthe- 
sis contain transcriptional attenuators at the 5' end of the first 
structural gene. The attenuators are regulated by the amino 
acid products of the particular operon. Thus, the availability of 
the cognate charged tRNA leads to the formation of a second- 
ary structure in the nascent transcript followed by ribosome 
stalling. In the absence of the cognate charged tRNA, an antiterminator
structure forms, which prevents formation of the RNA
hairpin in the terminator and prevents transcriptional termination
(325). The antiterminator element that enables
RNA polymerase to override a rho-dependent terminator
in the ribosomal RNA operons has been identified and is
referred to as boxA (41, 341). Transcriptional antitermination
is a remarkably complex process that involves many known and 
as yet unidentified host factors. This topic has been covered in 
great detail in two excellent recent reviews (110, 456). Here we
will briefly consider the use of antitermination elements that
are useful in the expression of heterologous genes in E. coli.

One of the more powerful and widely used expression systems
in E. coli makes use of the phage T7 late promoter (537,
548). The activity of this system depends on a transcription unit 
that supplies the T7 RNA polymerase, whose tight repression 
is essential to avoid leakiness of the T7 promoter. Several 
approaches have been used to regulate the expression of the
T7 RNA polymerase, and each has its own unique disadvantages
(374). Mertens et al. (374) addressed this problem by con- 
structing a reversibly attenuated T7 RNA polymerase expres- 
TABLE 1. Promoters used for the high-level expression of genes in E. coli

Promoter (source)                   Regulation                        Induction                             Reference(s)

lac (E. coli)                       lacI, lacIq                       IPTG                                  17, 18, 221, 460, 610
                                    lacI(Ts) [a], lacIq(Ts) [a]       Thermal                               234
                                    lacI(Ts) [b]                      Thermal                               604
trp (E. coli)                                                         Trp starvation, indole acrylic acid   365, 470, 549, 612
lpp (E. coli)                                                         IPTG, lactose [c]                     128a, 142, 185, 275, 401
phoA (E. coli)                      phoB (positive), phoR (negative)  Phosphate starvation                  84, 274, 291, 306, 382, 562
recA (E. coli)                      lexA                              Nalidixic acid                        145, 260, 428, 516
araBAD (E. coli)                    araC                              L-Arabinose                           554
proU (E. coli)                                                        Osmolarity                            247
cst-1 (E. coli)                                                       Glucose starvation                    564
tetA (E. coli)                                                        Tetracycline                          125, 523
cadA (E. coli)                      cadR                              pH                                    102, 480, 561
nar (E. coli)                       fnr (FNR, NARL)                   Anaerobic conditions, nitrate ion     335
tac, hybrid (E. coli)                                                 IPTG                                  7, 123, 471
                                    lacI [d]                          Thermal                               603
trc, hybrid (E. coli)               lacI, lacIq                       IPTG                                  65
                                    lacI(Ts) [a], lacIq(Ts) [a]       Thermal                               4, 9
lpp-lac, hybrid (E. coli)           lacI                              IPTG                                  261, 263
Psyn, synthetic (E. coli)           lacI, lacIq                       IPTG                                  186
Starvation promoters (E. coli)                                                                              366
pL (λ)                              λcIts857                          Thermal                               43, 80, 129, 130, 240, 454
pL-9G-50, mutant (λ)                                                  Reduced temperature (<20°C)           187, 433
cspA (E. coli)                                                        Reduced temperature (<20°C)           187, 206, 433, 551
pR, pL, tandem (λ)                  λcIts857                          Thermal                               150, 493
T7 (T7)                             λcIts857                          Thermal                               537, 548
T7-lac operator (T7)                lacIq                             IPTG                                  141, 190, 239
λpL, pT7, tandem (λ, T7)            λcIts857, lacIq                   Thermal, IPTG                         375
T3-lac operator (T3)                lacIq                             IPTG                                  190, 605
T5-lac operator (T5)                lacIq, lacI                       IPTG                                  71, 390
T4 gene 32 (T4)                                                       T4 infection                          143, 210
nprM-lac operator (Bacillus spp.)   lacIq                             IPTG                                  605
VHb (Vitreoscilla spp.)                                               Oxygen, cAMP-CAP [e]                  304, 305
Protein A (Staphylococcus aureus)                                                                           1, 256, 349

[a] lacI gene with single mutation, Gly-187 to Ser (72).
[c] ... expression occurs only in the presence of a lac inducer (142).
[d] Wild-type lacI gene.
[e] cAMP-CAP, cyclic AMP-catabolite activator protein.


Lanzer and Bujard carried out extensive studies on the commonly
used lac-based promoter-operator systems and demonstrated
up to 70-fold differences in the level of repression when
the operator was placed in different positions within the promoter
sequence (328). Thus, when the 17-bp operator was
placed between the -10 and -35 hexameric regions, a 50- to
70-fold-greater repression was caused than when the operator
was placed either upstream of the -35 region or downstream
of the -10 site (328).

A third important characteristic of a promoter is its inducibility
in a simple and cost-effective manner. The most widely
used promoters for large-scale protein production use thermal
induction (λ pL) or chemical inducers (trp) (Table 1). The
isopropyl-β-D-thiogalactopyranoside (IPTG)-inducible hybrid
promoters tac (123) or trc (65) are powerful and widely used
for basic research. However, the use of IPTG for the large-scale
production of human therapeutic proteins is undesirable
because of its toxicity (159) and cost. These drawbacks of IPTG
have until now precluded the use of the tac or trc promoter
for the production of human therapeutic proteins and rendered
the large-scale expression of proteins for basic research
prohibitively expensive. The availability of a mutant lacI(Ts)
gene that encodes a thermosensitive lac repressor (72) now
permits the thermal induction of these promoters (4, 9, 234). In
addition, the new vectors exhibit tight regulation of the trc
promoter at 30°C (9). Two different lac repressor mutants that
are thermosensitive (586, 604) as well as IPTG inducible (586)
have recently been described. Although the wild-type lacI gene
can be thermally induced (602, 603), this system is not tightly
regulated and cannot be used in lacIq strains, since a temperature
shift does not override the tight repression caused by the
overproduction of the lac repressor (603). Thus, this system is
limited to the production of some proteins that are not detrimental
to the host cell.

Cold-responsive promoters, although much less extensively
studied than many of the other promoters included here, have
been shown to facilitate efficient gene expression at reduced
temperatures. The activity of the phage λ pL promoter was
highest at 20°C and declined as the temperature was raised
(187). This cold response of the pL promoter is positively
regulated by the E. coli integration host factor, a sequence-specific,
multifunctional protein that binds and bends DNA
(164, 165, 188). The promoter of the major cold shock gene
cspA (206, 551) was similarly demonstrated to be active at
reduced temperatures (187). Molecular dissection of the cspA
and pL promoters led to the identification of specific DNA
regions involved in the enhancement of transcription at lower
temperatures; this has allowed the development of pL derivatives
that are highly active at temperatures below 20°C (433).
The rationale behind the use of cold-responsive promoters for
EXPRESSION VECTORS
INTRODUCTION

The choice of an expression system for the high-level production
of recombinant proteins depends on many factors.
These include cell growth characteristics, expression levels,
intracellular and extracellular expression, posttranslational
modifications, and biological activity of the protein of interest,
as well as regulatory issues in the production of therapeutic
proteins (191, 254). In addition, the selection of a particular
expression system requires a cost breakdown in terms of process,
design, and other economic considerations. The relative
merits of bacterial, yeast, insect, and mammalian expression
systems have been examined in detail in an excellent review by
Marino (362). In addition, Datar et al. (121) have analyzed the
economic issues associated with protein production in bacterial
and mammalian cells.

The many advantages of Escherichia coli have ensured that it
remains a valuable organism for the high-level production of
recombinant proteins (177a, 197, 254, 362, 406, 426, 510).
However, in spite of the extensive knowledge on the genetics
and molecular biology of E. coli, not every gene can be expressed
efficiently in this organism. This may be due to the
unique and subtle structural features of the gene sequence, the
stability and translational efficiency of mRNA, the ease of
protein folding, degradation of the protein by host cell proteases,
major differences in codon usage between the foreign
gene and native E. coli, and the potential toxicity of the protein
to the host. Fortunately, some empirical "rules" that can guide
the design of expression systems and limit the unpredictability
of this operation in E. coli have emerged. The major drawbacks
of E. coli as an expression system include the inability to perform
many of the posttranslational modifications found in eukaryotic
proteins, the lack of a secretion mechanism for the
efficient release of protein into the culture medium, and the
limited ability to facilitate extensive disulfide bond formation.
On the other hand, many eukaryotic proteins retain their full
biological activity in a nonglycosylated form and therefore can
be produced in E. coli (see, e.g., references 170, 342, and 486).
In addition, some progress has been made in the areas of
extracellular secretion and disulfide bond formation, and these
will be examined.

The objectives of this review are to integrate the extensive
published literature on gene expression in E. coli, to focus on
expression systems and experimental approaches useful for the
overproduction of proteins, and to review recent progress in
this field. Areas that have been covered in detail in recent
reviews are included in abbreviated form in order to present
their key conclusions and to serve as a source for further

* ... Department of Molecular Biology, T Cell Sci-



